Query Auditing for Protecting Max/Min Values of Sensitive Attributes in Statistical Databases
Abstract
In this paper, we define a novel setting for query auditing, where instead of detecting or preventing the disclosure of individual sensitive values, we want to detect or prevent the disclosure of aggregate values in the database. More specifically, we study the problem of detecting or preventing the disclosure of the maximum (minimum) value in the database, when the querier is allowed to issue average queries to the database. We propose efficient off-line and on-line query auditors for this problem in the full disclosure model, and an efficient simulatable on-line query auditor in the partial disclosure model.
Similar Resources
Auditing Categorical SUM, MAX and MIN Queries
Auditing consists in logging answered queries and checking, each time that a new query is submitted, that no sensitive information is disclosed by combining responses to answered queries with the response to the current query. Such a method for controlling data disclosure naturally raises the following inference problem: Given a set Q of answered queries and a query q, is the information asked ...
Computational complexity of auditing finite attributes in statistical databases
We study the computational complexity of auditing finite attributes in databases allowing statistical queries. Given a database that supports statistical queries, the auditing problem is to check whether an attribute can be completely determined or not from a given set of statistical information. Some restricted cases of this problem have been investigated earlier, e.g. the complexity of statis...
Privacy in Multidimensional Databases
When answering queries that ask for summary statistics, the query-system of a multidimensional database should guard confidential data, that is, it should avoid revealing (directly or indirectly) individual data, which could be exactly calc...
Efficient Computation of Statistical Significance of Query Results in Databases
Queries such as database similarity searches return results satisfying certain properties of distances or scores. For domain scientists, the absolute values of scores are seldom sufficient. Statistical significance or p-value of the result is a more useful criterion. This can be computed using an appropriate model of random objects. The problem of computing p-values becomes more acute when quer...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...